HBASE-28686 MapReduceBackupCopyJob should support custom DistCp options #6017

Merged
merged 2 commits into from
Jul 18, 2024

rmdmattingly
Contributor

Problem

The MapReduceBackupCopyJob class provides no means for updating DistCp job options. This means that you're stuck with defaults, which isn't always desirable. For example, my workplace would like the freedom to deviate from at least two DistCp defaults:

distcp.direct.write: we would like to set this to true, because writing and renaming tmp files is expensive in S3 (where we store our backups).
The number of mappers: we would also like control over how many mappers DistCp will run.

Proposed Solution

It's not the prettiest solution, but I'm proposing that we support DistCp customizations via the given backup client configuration like this. The conf -> arg conversion is necessary because we still want to use DistCp's run method, which expects args, so that we don't change any error codes. Hadoop actually does something similar in the opposite direction: the DistCp job has logic to convert the args back into configurations (lol).
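
To make the conf -> arg idea concrete, here is a minimal, self-contained sketch. It is not the code in this PR: the class name `DistCpArgSketch`, the local `Switch` enum, and the use of a plain `Map` in place of a Hadoop `Configuration` are all assumptions for illustration. The two label/flag pairs (`distcp.direct.write` with `-direct`, `distcp.max.maps` with `-m`) reflect my understanding of DistCp's switches and should be checked against the Hadoop version in use.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: turn configuration entries into DistCp-style command-line args. */
public class DistCpArgSketch {

  /** Local stand-in for a couple of DistCp switches; NOT Hadoop's own enum. */
  enum Switch {
    DIRECT_WRITE("distcp.direct.write", "direct", false),
    MAX_MAPS("distcp.max.maps", "m", true);

    final String confLabel; // configuration key to look up
    final String flag;      // command-line flag, without the leading dash
    final boolean hasArg;   // whether the flag is followed by a value

    Switch(String confLabel, String flag, boolean hasArg) {
      this.confLabel = confLabel;
      this.flag = flag;
      this.hasArg = hasArg;
    }
  }

  /** Builds an args list: any configured switches first, then source and destination. */
  public static List<String> toArgs(Map<String, String> conf, String src, String dst) {
    List<String> args = new ArrayList<>();
    for (Switch s : Switch.values()) {
      String value = conf.get(s.confLabel);
      if (value == null) {
        continue; // not configured, so DistCp's default applies
      }
      if (s.hasArg) {
        args.add("-" + s.flag);
        args.add(value);
      } else if (Boolean.parseBoolean(value)) {
        args.add("-" + s.flag); // boolean switch: emit the flag only when true
      }
    }
    args.add(src);
    args.add(dst);
    return args;
  }

  public static void main(String[] unused) {
    Map<String, String> conf = new LinkedHashMap<>();
    conf.put("distcp.direct.write", "true");
    conf.put("distcp.max.maps", "20");
    // Paths are placeholders for illustration only.
    System.out.println(toArgs(conf, "hdfs://src/path", "s3a://bucket/backup"));
  }
}
```

The resulting list is what would be handed to DistCp's run method as its args, which is why the existing exit-code handling stays untouched.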

Further, the DistCp API is unfortunately poorly designed for programmatic use, so it doesn't leave us great alternatives. For example, if you use the run method, it doesn't matter what DistCpOptions you pass to the constructor: your options will be overwritten based on the args that you pass in. Alternatively, if you pass DistCpOptions to the constructor and use DistCp#execute or DistCp#createAndSubmitJob, then you get none of the error specificity!

@ndimiduk @charlesconnell @hgromer

@Apache-HBase

This comment has been minimized.

@rmdmattingly
Contributor Author

I believe these build failures are just noise, and will be fixed by a change that I've added to another PR: #6018 (comment)

@rmdmattingly rmdmattingly requested a review from Apache9 July 9, 2024 19:51

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 45s | | Docker mode activated. |
| _ Prechecks _ | | | | |
| +1 💚 | dupname | 0m 0s | | No case conflicting files found. |
| +0 🆗 | codespell | 0m 0s | | codespell was not available. |
| +0 🆗 | detsecrets | 0m 0s | | detect-secrets was not available. |
| +1 💚 | @author | 0m 0s | | The patch does not contain any @author tags. |
| +1 💚 | hbaseanti | 0m 0s | | Patch does not have any anti-patterns. |
| _ master Compile Tests _ | | | | |
| +1 💚 | mvninstall | 4m 43s | | master passed |
| +1 💚 | compile | 0m 35s | | master passed |
| +1 💚 | checkstyle | 0m 11s | | master passed |
| +1 💚 | spotbugs | 0m 35s | | master passed |
| +1 💚 | spotless | 0m 49s | | branch has no errors when running spotless:check. |
| _ Patch Compile Tests _ | | | | |
| +1 💚 | mvninstall | 3m 6s | | the patch passed |
| +1 💚 | compile | 0m 30s | | the patch passed |
| +1 💚 | javac | 0m 30s | | the patch passed |
| +1 💚 | blanks | 0m 0s | | The patch has no blanks issues. |
| +1 💚 | checkstyle | 0m 10s | | the patch passed |
| +1 💚 | spotbugs | 0m 40s | | the patch passed |
| +1 💚 | hadoopcheck | 11m 28s | | Patch does not cause any errors with Hadoop 3.3.6 3.4.0. |
| +1 💚 | spotless | 0m 44s | | patch has no errors when running spotless:check. |
| _ Other Tests _ | | | | |
| +1 💚 | asflicense | 0m 8s | | The patch does not generate ASF License warnings. |
| | | 31m 36s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.43 ServerAPI=1.43 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6017/6/artifact/yetus-general-check/output/Dockerfile |
| GITHUB PR | #6017 |
| Optional Tests | dupname asflicense javac spotbugs checkstyle codespell detsecrets compile hadoopcheck hbaseanti spotless |
| uname | Linux 143b26866bc3 5.4.0-1103-aws #111~18.04.1-Ubuntu SMP Tue May 23 20:04:10 UTC 2023 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / bee79dc |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Max. process+thread count | 83 (vs. ulimit of 30000) |
| modules | C: hbase-backup U: hbase-backup |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6017/6/console |
| versions | git=2.34.1 maven=3.9.8 spotbugs=4.7.3 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@Apache-HBase

🎊 +1 overall

| Vote | Subsystem | Runtime | Logfile | Comment |
| --- | --- | --- | --- | --- |
| +0 🆗 | reexec | 0m 46s | | Docker mode activated. |
| -0 ⚠️ | yetus | 0m 3s | | Unprocessed flag(s): --brief-report-file --spotbugs-strict-precheck --author-ignore-list --blanks-eol-ignore-file --blanks-tabs-ignore-file --quick-hadoopcheck |
| _ Prechecks _ | | | | |
| _ master Compile Tests _ | | | | |
| +1 💚 | mvninstall | 4m 0s | | master passed |
| +1 💚 | compile | 0m 24s | | master passed |
| +1 💚 | javadoc | 0m 15s | | master passed |
| +1 💚 | shadedjars | 6m 34s | | branch has no errors when building our shaded downstream artifacts. |
| _ Patch Compile Tests _ | | | | |
| +1 💚 | mvninstall | 3m 53s | | the patch passed |
| +1 💚 | compile | 0m 21s | | the patch passed |
| +1 💚 | javac | 0m 21s | | the patch passed |
| +1 💚 | javadoc | 0m 17s | | the patch passed |
| +1 💚 | shadedjars | 6m 58s | | patch has no errors when building our shaded downstream artifacts. |
| _ Other Tests _ | | | | |
| +1 💚 | unit | 11m 2s | | hbase-backup in the patch passed. |
| | | 35m 37s | | |

| Subsystem | Report/Notes |
| --- | --- |
| Docker | ClientAPI=1.46 ServerAPI=1.46 base: https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6017/6/artifact/yetus-jdk17-hadoop3-check/output/Dockerfile |
| GITHUB PR | #6017 |
| Optional Tests | javac javadoc unit compile shadedjars |
| uname | Linux a7ff1dbf7fe1 5.4.0-182-generic #202-Ubuntu SMP Fri Apr 26 12:29:36 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | dev-support/hbase-personality.sh |
| git revision | master / bee79dc |
| Default Java | Eclipse Adoptium-17.0.11+9 |
| Test Results | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6017/6/testReport/ |
| Max. process+thread count | 3440 (vs. ulimit of 30000) |
| modules | C: hbase-backup U: hbase-backup |
| Console output | https://ci-hbase.apache.org/job/HBase-PreCommit-GitHub-PR/job/PR-6017/6/console |
| versions | git=2.34.1 maven=3.9.8 |
| Powered by | Apache Yetus 0.15.0 https://yetus.apache.org |

This message was automatically generated.

@ndimiduk ndimiduk merged commit b4cbd5c into apache:master Jul 18, 2024
1 check passed
@ndimiduk ndimiduk deleted the HBASE-28686 branch July 18, 2024 11:45
ndimiduk pushed a commit to ndimiduk/hbase that referenced this pull request Jul 18, 2024
…ns (apache#6017)

Co-authored-by: Ray Mattingly <[email protected]>
Signed-off-by: Duo Zhang <[email protected]>
Signed-off-by: Nick Dimiduk <[email protected]>
ndimiduk pushed a commit that referenced this pull request Jul 18, 2024
…ns (#6017)

Co-authored-by: Ray Mattingly <[email protected]>
Signed-off-by: Duo Zhang <[email protected]>
Signed-off-by: Nick Dimiduk <[email protected]>
rmdmattingly added a commit to HubSpot/hbase that referenced this pull request Jul 19, 2024
…ns (apache#6017)

Co-authored-by: Ray Mattingly <[email protected]>
Signed-off-by: Duo Zhang <[email protected]>
Signed-off-by: Nick Dimiduk <[email protected]>
ndimiduk pushed a commit that referenced this pull request Jul 22, 2024
…ns (#6017)

Co-authored-by: Ray Mattingly <[email protected]>
Signed-off-by: Duo Zhang <[email protected]>
Signed-off-by: Nick Dimiduk <[email protected]>
ndimiduk pushed a commit to ndimiduk/hbase that referenced this pull request Jul 22, 2024
…ns (apache#6017)

Co-authored-by: Ray Mattingly <[email protected]>
Signed-off-by: Duo Zhang <[email protected]>
Signed-off-by: Nick Dimiduk <[email protected]>